Dataset statistics
| Number of variables | 34 |
|---|---|
| Number of observations | 100000 |
| Missing cells | 100031 |
| Missing cells (%) | 2.9% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 26.7 MiB |
| Average record size in memory | 280.0 B |
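The derived figures in this table can be cross-checked against the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over the full grid of variables × observations and that 1 MiB equals 2^20 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 34
n_observations = 100_000
missing_cells = 100_031
total_size_mib = 26.7

# Assumption: the percentage is computed over all cells (variables x observations).
missing_pct = missing_cells / (n_variables * n_observations) * 100

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")               # Missing cells: 2.9%
print(f"Average record size: {avg_record_bytes:.1f} B")   # Average record size: 280.0 B
```

Both results agree with the reported 2.9% and 280.0 B.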
Variable types
| Numeric | 4 |
|---|---|
| Categorical | 23 |
| Text | 4 |
| DateTime | 2 |
| Unsupported | 1 |
Alerts
| user_id is highly overall correlated with sex and 1 other field | High correlation |
| movie_id is highly overall correlated with unknown and 19 other fields | High correlation |
| unknown is highly overall correlated with movie_id and 1 other field | High correlation |
| Action is highly overall correlated with movie_id and 1 other field | High correlation |
| Adventure is highly overall correlated with movie_id and 1 other field | High correlation |
| Animation is highly overall correlated with movie_id and 2 other fields | High correlation |
| Children's is highly overall correlated with movie_id and 2 other fields | High correlation |
| Comedy is highly overall correlated with movie_id and 1 other field | High correlation |
| Crime is highly overall correlated with movie_id and 1 other field | High correlation |
| Documentary is highly overall correlated with movie_id and 1 other field | High correlation |
| Drama is highly overall correlated with movie_id and 1 other field | High correlation |
| Fantasy is highly overall correlated with movie_id and 1 other field | High correlation |
| Film-Noir is highly overall correlated with movie_id and 1 other field | High correlation |
| Horror is highly overall correlated with movie_id and 1 other field | High correlation |
| Musical is highly overall correlated with movie_id and 1 other field | High correlation |
| Mystery is highly overall correlated with movie_id and 1 other field | High correlation |
| Romance is highly overall correlated with movie_id and 1 other field | High correlation |
| Sci-Fi is highly overall correlated with movie_id and 1 other field | High correlation |
| Thriller is highly overall correlated with movie_id and 1 other field | High correlation |
| War is highly overall correlated with movie_id and 1 other field | High correlation |
| Western is highly overall correlated with movie_id and 1 other field | High correlation |
| genre is highly overall correlated with movie_id and 19 other fields | High correlation |
| sex is highly overall correlated with user_id | High correlation |
| occupation is highly overall correlated with user_id | High correlation |
| unknown is highly imbalanced (99.9%) | Imbalance |
| Animation is highly imbalanced (77.6%) | Imbalance |
| Children's is highly imbalanced (62.7%) | Imbalance |
| Crime is highly imbalanced (59.6%) | Imbalance |
| Documentary is highly imbalanced (93.6%) | Imbalance |
| Fantasy is highly imbalanced (89.7%) | Imbalance |
| Film-Noir is highly imbalanced (87.4%) | Imbalance |
| Horror is highly imbalanced (70.0%) | Imbalance |
| Musical is highly imbalanced (71.6%) | Imbalance |
| Mystery is highly imbalanced (70.3%) | Imbalance |
| War is highly imbalanced (55.0%) | Imbalance |
| Western is highly imbalanced (86.7%) | Imbalance |
| video_release_date has 100000 (100.0%) missing values | Missing |
| video_release_date is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
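The imbalance percentages in the alerts are not explained in the report itself. One reading that reproduces the reported figures is a score of 1 minus the normalized Shannon entropy of the class counts. The sketch below rests on that assumption (the function name `imbalance_score` is hypothetical, not an API of the profiling tool) and uses the 0/1 counts listed for Animation and unknown in their Common Values tables.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    Assumption: this is only one plausible definition of "imbalance";
    0.0 means perfectly balanced classes, values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts of 0 and 1 taken from the per-variable Common Values tables.
animation = imbalance_score([96395, 3605])
unknown = imbalance_score([99990, 10])

print(f"Animation: {animation:.1%}")
print(f"unknown: {unknown:.1%}")
```

Under this assumption the two scores round to the 77.6% and 99.9% shown in the alerts, which suggests, but does not prove, that the report uses an entropy-based measure.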
Reproduction
| Analysis started | 2023-11-06 22:05:03.825065 |
|---|---|
| Analysis finished | 2023-11-06 22:05:37.017695 |
| Duration | 33.19 seconds |
| Software version | ydata-profiling v4.6.1 |
| Download configuration | config.json |
user_id
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 943 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 461.48475 |
| Minimum | 0 |
| Maximum | 942 |
| Zeros | 272 |
| Zeros (%) | 0.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 45 |
| Q1 | 253 |
| median | 446 |
| Q3 | 681 |
| 95-th percentile | 891 |
| Maximum | 942 |
| Range | 942 |
| Interquartile range (IQR) | 428 |
Descriptive statistics
| Standard deviation | 266.61442 |
|---|---|
| Coefficient of variation (CV) | 0.57773181 |
| Kurtosis | -1.0973667 |
| Mean | 461.48475 |
| Median Absolute Deviation (MAD) | 213 |
| Skewness | 0.082533291 |
| Sum | 46148475 |
| Variance | 71083.249 |
| Monotonicity | Not monotonic |
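Several of the user_id statistics are derived from one another, so they can be sanity-checked with basic arithmetic. The sketch below assumes the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1); the variable names are illustrative.

```python
# Values copied from the user_id tables.
n = 100_000
total = 46_148_475
std = 266.61442
q1, q3 = 253, 681
minimum, maximum = 0, 942

mean = total / n                 # reported: 461.48475
cv = std / mean                  # reported: 0.57773181
variance = std ** 2              # reported: 71083.249
iqr = q3 - q1                    # reported: 428
value_range = maximum - minimum  # reported: 942

print(mean, round(cv, 6), round(variance, 2), iqr, value_range)
```

Small differences in the last digits are expected because the report rounds its inputs.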
Common Values
| Value | Count | Frequency (%) |
| 404 | 737 | 0.7% |
| 654 | 685 | 0.7% |
| 12 | 636 | 0.6% |
| 449 | 540 | 0.5% |
| 275 | 518 | 0.5% |
| 415 | 493 | 0.5% |
| 536 | 490 | 0.5% |
| 302 | 484 | 0.5% |
| 233 | 480 | 0.5% |
| 392 | 448 | 0.4% |
| Other values (933) | 94489 | 94.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 272 | 0.3% |
| 1 | 62 | 0.1% |
| 2 | 54 | 0.1% |
| 3 | 24 | < 0.1% |
| 4 | 175 | 0.2% |
| 5 | 211 | 0.2% |
| 6 | 403 | 0.4% |
| 7 | 59 | 0.1% |
| 8 | 22 | < 0.1% |
| 9 | 184 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 942 | 168 | 0.2% |
| 941 | 79 | 0.1% |
| 940 | 22 | < 0.1% |
| 939 | 107 | 0.1% |
| 938 | 49 | < 0.1% |
| 937 | 108 | 0.1% |
| 936 | 40 | < 0.1% |
| 935 | 142 | 0.1% |
| 934 | 39 | < 0.1% |
| 933 | 174 | 0.2% |
movie_id
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 1682 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 424.53013 |
| Minimum | 0 |
| Maximum | 1681 |
| Zeros | 452 |
| Zeros (%) | 0.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 29 |
| Q1 | 174 |
| median | 321 |
| Q3 | 630 |
| 95-th percentile | 1073 |
| Maximum | 1681 |
| Range | 1681 |
| Interquartile range (IQR) | 456 |
Descriptive statistics
| Standard deviation | 330.79836 |
|---|---|
| Coefficient of variation (CV) | 0.77921055 |
| Kurtosis | 0.42253411 |
| Mean | 424.53013 |
| Median Absolute Deviation (MAD) | 196 |
| Skewness | 0.9863565 |
| Sum | 42453013 |
| Variance | 109427.55 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 49 | 583 | 0.6% |
| 257 | 509 | 0.5% |
| 99 | 508 | 0.5% |
| 180 | 507 | 0.5% |
| 293 | 485 | 0.5% |
| 285 | 481 | 0.5% |
| 287 | 478 | 0.5% |
| 0 | 452 | 0.5% |
| 299 | 431 | 0.4% |
| 120 | 429 | 0.4% |
| Other values (1672) | 95137 | 95.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 452 | 0.5% |
| 1 | 131 | 0.1% |
| 2 | 90 | 0.1% |
| 3 | 209 | 0.2% |
| 4 | 86 | 0.1% |
| 5 | 26 | < 0.1% |
| 6 | 392 | 0.4% |
| 7 | 219 | 0.2% |
| 8 | 299 | 0.3% |
| 9 | 89 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 1681 | 1 | < 0.1% |
| 1680 | 1 | < 0.1% |
| 1679 | 1 | < 0.1% |
| 1678 | 1 | < 0.1% |
| 1677 | 1 | < 0.1% |
| 1676 | 1 | < 0.1% |
| 1675 | 1 | < 0.1% |
| 1674 | 1 | < 0.1% |
| 1673 | 1 | < 0.1% |
| 1672 | 1 | < 0.1% |
rating
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 300000 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3.0 |
|---|---|
| 2nd row | 2.0 |
| 3rd row | 4.0 |
| 4th row | 4.0 |
| 5th row | 4.0 |
Common Values
| Value | Count | Frequency (%) |
| 4.0 | 34174 | 34.2% |
| 3.0 | 27145 | 27.1% |
| 5.0 | 21201 | 21.2% |
| 2.0 | 11370 | 11.4% |
| 1.0 | 6110 | 6.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 100000 | 33.3% |
| 0 | 100000 | 33.3% |
| 4 | 34174 | 11.4% |
| 3 | 27145 | 9.0% |
| 5 | 21201 | 7.1% |
| 2 | 11370 | 3.8% |
| 1 | 6110 | 2.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 200000 | 66.7% |
| Other Punctuation | 100000 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 100000 | 50.0% |
| 4 | 34174 | 17.1% |
| 3 | 27145 | 13.6% |
| 5 | 21201 | 10.6% |
| 2 | 11370 | 5.7% |
| 1 | 6110 | 3.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 100000 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 300000 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 100000 | 33.3% |
| 0 | 100000 | 33.3% |
| 4 | 34174 | 11.4% |
| 3 | 27145 | 9.0% |
| 5 | 21201 | 7.1% |
| 2 | 11370 | 3.8% |
| 1 | 6110 | 2.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 300000 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 100000 | 33.3% |
| 0 | 100000 | 33.3% |
| 4 | 34174 | 11.4% |
| 3 | 27145 | 9.0% |
| 5 | 21201 | 7.1% |
| 2 | 11370 | 3.8% |
| 1 | 6110 | 2.0% |
unix_timestamp
Real number (ℝ)
| Distinct | 49282 |
|---|---|
| Distinct (%) | 49.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.8352885 × 10⁸ |
| Minimum | 8.7472471 × 10⁸ |
| Maximum | 8.9328664 × 10⁸ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 MiB |
Quantile statistics
| Minimum | 8.7472471 × 10⁸ |
|---|---|
| 5-th percentile | 8.7532031 × 10⁸ |
| Q1 | 8.7944871 × 10⁸ |
| median | 8.8282694 × 10⁸ |
| Q3 | 8.8825998 × 10⁸ |
| 95-th percentile | 8.9171789 × 10⁸ |
| Maximum | 8.9328664 × 10⁸ |
| Range | 18561928 |
| Interquartile range (IQR) | 8811274.5 |
Descriptive statistics
| Standard deviation | 5343856.2 |
|---|---|
| Coefficient of variation (CV) | 0.0060483098 |
| Kurtosis | -1.1687487 |
| Mean | 8.8352885 × 10⁸ |
| Median Absolute Deviation (MAD) | 3886481 |
| Skewness | 0.1738863 |
| Sum | 8.8352885 × 10¹³ |
| Variance | 2.8556799 × 10¹³ |
| Monotonicity | Not monotonic |
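The raw numbers are easier to interpret as calendar dates. The sketch below assumes the values are seconds since the Unix epoch in UTC (the report treats the column as a plain real number and does not say so), and uses the exact smallest and largest values listed in this variable's extreme-value tables.

```python
from datetime import datetime, timezone

# Exact extremes of unix_timestamp as listed in the value tables.
min_ts = 874_724_710
max_ts = 893_286_638

# Assumption: seconds since 1970-01-01 00:00:00 UTC.
first = datetime.fromtimestamp(min_ts, tz=timezone.utc)
last = datetime.fromtimestamp(max_ts, tz=timezone.utc)
span_days = (max_ts - min_ts) / 86_400

print(first.isoformat())        # 1997-09-20T03:05:10+00:00
print(last.isoformat())         # 1998-04-22T23:10:38+00:00
print(f"{span_days:.1f} days")  # 214.8 days
```

Read this way, the reported range of 18561928 corresponds to roughly seven months of ratings.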
Common Values
| Value | Count | Frequency (%) |
| 891033606 | 12 | < 0.1% |
| 878962496 | 10 | < 0.1% |
| 881107817 | 10 | < 0.1% |
| 888637768 | 10 | < 0.1% |
| 876896210 | 10 | < 0.1% |
| 891331825 | 10 | < 0.1% |
| 889665232 | 10 | < 0.1% |
| 891034835 | 10 | < 0.1% |
| 884902317 | 10 | < 0.1% |
| 884901497 | 10 | < 0.1% |
| Other values (49272) | 99898 | 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 874724710 | 1 | < 0.1% |
| 874724727 | 1 | < 0.1% |
| 874724754 | 1 | < 0.1% |
| 874724781 | 1 | < 0.1% |
| 874724843 | 1 | < 0.1% |
| 874724882 | 2 | < 0.1% |
| 874724905 | 1 | < 0.1% |
| 874724937 | 1 | < 0.1% |
| 874724988 | 1 | < 0.1% |
| 874725081 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 893286638 | 7 | < 0.1% |
| 893286637 | 3 | < 0.1% |
| 893286603 | 1 | < 0.1% |
| 893286584 | 1 | < 0.1% |
| 893286550 | 3 | < 0.1% |
| 893286511 | 2 | < 0.1% |
| 893286502 | 1 | < 0.1% |
| 893286501 | 3 | < 0.1% |
| 893286491 | 1 | < 0.1% |
| 893286373 | 1 | < 0.1% |
title
Text
| Distinct | 1664 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 81 |
|---|---|
| Median length | 61 |
| Mean length | 22.78191 |
| Min length | 7 |
Characters and Unicode
| Total characters | 2278191 |
|---|---|
| Distinct characters | 79 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 134 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Kolya (1996) |
|---|---|
| 2nd row | Men in Black (1997) |
| 3rd row | Truth About Cats & Dogs, The (1996) |
| 4th row | Birdcage, The (1996) |
| 5th row | Adventures of Priscilla, Queen of the Desert, The (1994) |
| Value | Count | Frequency (%) |
| the | 32193 | 8.5% |
| 1996 | 18745 | 5.0% |
| 1997 | 15384 | 4.1% |
| 1995 | 12408 | 3.3% |
| 1994 | 9034 | 2.4% |
| of | 7065 | 1.9% |
| 1993 | 6671 | 1.8% |
| and | 4828 | 1.3% |
| a | 4087 | 1.1% |
| in | 3359 | 0.9% |
| Other values (2470) | 264486 | 69.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 278275 | 12.2% |
| 9 | 174495 | 7.7% |
| e | 159599 | 7.0% |
| 1 | 106420 | 4.7% |
| ( | 101985 | 4.5% |
| ) | 101985 | 4.5% |
| a | 101802 | 4.5% |
| n | 88173 | 3.9% |
| r | 87855 | 3.9% |
| o | 86945 | 3.8% |
| Other values (69) | 990657 | 43.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1101108 | 48.3% |
| Decimal Number | 407113 | 17.9% |
| Space Separator | 278275 | 12.2% |
| Uppercase Letter | 244851 | 10.7% |
| Open Punctuation | 101985 | 4.5% |
| Close Punctuation | 101985 | 4.5% |
| Other Punctuation | 41985 | 1.8% |
| Dash Punctuation | 889 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 159599 | 14.5% |
| a | 101802 | 9.2% |
| n | 88173 | 8.0% |
| r | 87855 | 8.0% |
| o | 86945 | 7.9% |
| i | 82866 | 7.5% |
| t | 78420 | 7.1% |
| s | 61152 | 5.6% |
| h | 58761 | 5.3% |
| l | 50123 | 4.6% |
| Other values (19) | 245412 | 22.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 34916 | 14.3% |
| S | 22002 | 9.0% |
| M | 15819 | 6.5% |
| B | 15613 | 6.4% |
| C | 14509 | 5.9% |
| A | 14018 | 5.7% |
| D | 13999 | 5.7% |
| F | 13517 | 5.5% |
| L | 11139 | 4.5% |
| W | 10810 | 4.4% |
| Other values (17) | 78509 | 32.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 174495 | 42.9% |
| 1 | 106420 | 26.1% |
| 6 | 25582 | 6.3% |
| 7 | 25486 | 6.3% |
| 5 | 18288 | 4.5% |
| 8 | 16283 | 4.0% |
| 4 | 15356 | 3.8% |
| 3 | 11184 | 2.7% |
| 2 | 7195 | 1.8% |
| 0 | 6824 | 1.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 24238 | 57.7% |
| : | 5024 | 12.0% |
| ' | 4877 | 11.6% |
| . | 4808 | 11.5% |
| & | 1220 | 2.9% |
| ! | 747 | 1.8% |
| * | 627 | 1.5% |
| / | 381 | 0.9% |
| ? | 63 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 278275 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 101985 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 101985 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 889 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1345959 | 59.1% |
| Common | 932232 | 40.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 159599 | 11.9% |
| a | 101802 | 7.6% |
| n | 88173 | 6.6% |
| r | 87855 | 6.5% |
| o | 86945 | 6.5% |
| i | 82866 | 6.2% |
| t | 78420 | 5.8% |
| s | 61152 | 4.5% |
| h | 58761 | 4.4% |
| l | 50123 | 3.7% |
| Other values (46) | 490263 | 36.4% |
Common
| Value | Count | Frequency (%) |
| (space) | 278275 | 29.9% |
| 9 | 174495 | 18.7% |
| 1 | 106420 | 11.4% |
| ( | 101985 | 10.9% |
| ) | 101985 | 10.9% |
| 6 | 25582 | 2.7% |
| 7 | 25486 | 2.7% |
| , | 24238 | 2.6% |
| 5 | 18288 | 2.0% |
| 8 | 16283 | 1.7% |
| Other values (13) | 59195 | 6.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2278110 | > 99.9% |
| None | 81 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 278275 | 12.2% |
| 9 | 174495 | 7.7% |
| e | 159599 | 7.0% |
| 1 | 106420 | 4.7% |
| ( | 101985 | 4.5% |
| ) | 101985 | 4.5% |
| a | 101802 | 4.5% |
| n | 88173 | 3.9% |
| r | 87855 | 3.9% |
| o | 86945 | 3.8% |
| Other values (65) | 990576 | 43.5% |
None
| Value | Count | Frequency (%) |
| é | 75 | 92.6% |
| è | 4 | 4.9% |
| Á | 1 | 1.2% |
| ö | 1 | 1.2% |
release_date
Date
| Distinct | 240 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 9 |
| Missing (%) | < 0.1% |
| Memory size | 1.5 MiB |
| Minimum | 1922-01-01 00:00:00 |
|---|---|
| Maximum | 1998-10-23 00:00:00 |
video_release_date
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 100000 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 1.5 MiB |
imdb_url
Text
| Distinct | 1660 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 13 |
| Missing (%) | < 0.1% |
| Memory size | 1.5 MiB |
Length
| Max length | 134 |
|---|---|
| Median length | 98 |
| Mean length | 60.15346 |
| Min length | 36 |
Characters and Unicode
| Total characters | 6014564 |
|---|---|
| Distinct characters | 76 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 134 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | http://us.imdb.com/M/title-exact?Kolya%20(1996) |
|---|---|
| 2nd row | http://us.imdb.com/M/title-exact?Men+in+Black+(1997) |
| 3rd row | http://us.imdb.com/M/title-exact?Truth%20About%20Cats%20&%20Dogs,%20The%20(1996) |
| 4th row | http://us.imdb.com/M/title-exact?Birdcage,%20The%20(1996) |
| 5th row | http://us.imdb.com/M/title-exact?Adventures%20of%20Priscilla,%20Queen%20of%20the%20Desert,%20The%20(1994) |
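The sample URLs appear to carry the movie title in their query string, encoded either with `%20` or with `+` for spaces. The sketch below shows one way to decode them with the Python standard library; the helper name `title_from_url` is hypothetical, and the claim that the query always holds the title is an assumption based only on these sample rows.

```python
from urllib.parse import unquote_plus, urlsplit


def title_from_url(url: str) -> str:
    """Decode the query part of a sample imdb_url into readable text.

    Assumption: the text after '?' is the percent- or plus-encoded title.
    """
    query = urlsplit(url).query
    # unquote_plus handles both "%20" and "+" as spaces.
    return unquote_plus(query)


samples = [
    "http://us.imdb.com/M/title-exact?Kolya%20(1996)",
    "http://us.imdb.com/M/title-exact?Men+in+Black+(1997)",
]
for url in samples:
    print(title_from_url(url))
# Kolya (1996)
# Men in Black (1997)
```

Note that `unquote_plus` would also turn a literal `+` in a title into a space, so this is only suitable as a rough comparison against the title column.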
| Value | Count | Frequency (%) |
| http://us.imdb.com/m/title-exact?star%20wars%20(1977 | 583 | 0.6% |
| http://us.imdb.com/title?contact+(1997/i | 509 | 0.5% |
| http://us.imdb.com/m/title-exact?fargo%20(1996 | 508 | 0.5% |
| http://us.imdb.com/m/title-exact?return%20of%20the%20jedi%20(1983 | 507 | 0.5% |
| http://us.imdb.com/title?liar+liar+(1997 | 485 | 0.5% |
| http://us.imdb.com/m/title-exact?english%20patient,%20the%20(1996 | 481 | 0.5% |
| http://us.imdb.com/m/title-exact?scream%20(1996 | 478 | 0.5% |
| http://us.imdb.com/m/title-exact?toy%20story%20(1995 | 452 | 0.5% |
| http://us.imdb.com/m/title-exact?air+force+one+(1997 | 431 | 0.4% |
| http://us.imdb.com/m/title-exact?independence%20day%20(1996 | 429 | 0.4% |
| Other values (1651) | 95152 | 95.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 573085 | 9.5% |
| / | 398144 | 6.6% |
| e | 350899 | 5.8% |
| i | 284092 | 4.7% |
| 2 | 257527 | 4.3% |
| % | 250211 | 4.2% |
| 0 | 245929 | 4.1% |
| c | 223080 | 3.7% |
| m | 221714 | 3.7% |
| . | 202953 | 3.4% |
| Other values (66) | 3006930 | 50.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3361616 | 55.9% |
| Other Punctuation | 1081180 | 18.0% |
| Decimal Number | 904928 | 15.0% |
| Uppercase Letter | 342634 | 5.7% |
| Dash Punctuation | 101227 | 1.7% |
| Open Punctuation | 95825 | 1.6% |
| Close Punctuation | 95825 | 1.6% |
| Math Symbol | 31301 | 0.5% |
| Space Separator | 28 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 573085 | 17.0% |
| e | 350899 | 10.4% |
| i | 284092 | 8.5% |
| c | 223080 | 6.6% |
| m | 221714 | 6.6% |
| a | 195679 | 5.8% |
| o | 184357 | 5.5% |
| s | 159368 | 4.7% |
| h | 156021 | 4.6% |
| l | 151145 | 4.5% |
| Other values (16) | 862176 | 25.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 111140 | 32.4% |
| T | 36507 | 10.7% |
| S | 21446 | 6.3% |
| C | 16953 | 4.9% |
| B | 15232 | 4.4% |
| A | 14554 | 4.2% |
| F | 13239 | 3.9% |
| D | 12893 | 3.8% |
| L | 10599 | 3.1% |
| P | 10496 | 3.1% |
| Other values (16) | 79575 | 23.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 257527 | 28.5% |
| 0 | 245929 | 27.2% |
| 9 | 174360 | 19.3% |
| 1 | 108125 | 11.9% |
| 6 | 26434 | 2.9% |
| 7 | 25359 | 2.8% |
| 8 | 19692 | 2.2% |
| 5 | 18823 | 2.1% |
| 4 | 15426 | 1.7% |
| 3 | 13253 | 1.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 398144 | 36.8% |
| % | 250211 | 23.1% |
| . | 202953 | 18.8% |
| : | 104429 | 9.7% |
| ? | 99999 | 9.2% |
| , | 20627 | 1.9% |
| ' | 3472 | 0.3% |
| & | 804 | 0.1% |
| ! | 541 | 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 101227 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 95825 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 95825 | 100.0% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 31301 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 28 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3704250 | 61.6% |
| Common | 2310314 | 38.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 573085 | 15.5% |
| e | 350899 | 9.5% |
| i | 284092 | 7.7% |
| c | 223080 | 6.0% |
| m | 221714 | 6.0% |
| a | 195679 | 5.3% |
| o | 184357 | 5.0% |
| s | 159368 | 4.3% |
| h | 156021 | 4.2% |
| l | 151145 | 4.1% |
| Other values (42) | 1204810 | 32.5% |
Common
| Value | Count | Frequency (%) |
| / | 398144 | 17.2% |
| 2 | 257527 | 11.1% |
| % | 250211 | 10.8% |
| 0 | 245929 | 10.6% |
| . | 202953 | 8.8% |
| 9 | 174360 | 7.5% |
| 1 | 108125 | 4.7% |
| : | 104429 | 4.5% |
| - | 101227 | 4.4% |
| ? | 99999 | 4.3% |
| Other values (14) | 367410 | 15.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6014564 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 573085 | 9.5% |
| / | 398144 | 6.6% |
| e | 350899 | 5.8% |
| i | 284092 | 4.7% |
| 2 | 257527 | 4.3% |
| % | 250211 | 4.2% |
| 0 | 245929 | 4.1% |
| c | 223080 | 3.7% |
| m | 221714 | 3.7% |
| . | 202953 | 3.4% |
| Other values (66) | 3006930 | 50.0% |
unknown
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 99990 | > 99.9% |
| 1 | 10 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 99990 | > 99.9% |
| 1 | 10 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 99990 | > 99.9% |
| 1 | 10 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 99990 | > 99.9% |
| 1 | 10 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 99990 | > 99.9% |
| 1 | 10 | < 0.1% |
Action
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 74411 | 74.4% |
| 1 | 25589 | 25.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 74411 | 74.4% |
| 1 | 25589 | 25.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 74411 | 74.4% |
| 1 | 25589 | 25.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 74411 | 74.4% |
| 1 | 25589 | 25.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 74411 | 74.4% |
| 1 | 25589 | 25.6% |
Adventure
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 86247 | 86.2% |
| 1 | 13753 | 13.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 86247 | 86.2% |
| 1 | 13753 | 13.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 86247 | 86.2% |
| 1 | 13753 | 13.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 86247 | 86.2% |
| 1 | 13753 | 13.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 86247 | 86.2% |
| 1 | 13753 | 13.8% |
Animation
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 96395 | 96.4% |
| 1 | 3605 | 3.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 96395 | 96.4% |
| 1 | 3605 | 3.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 96395 | 96.4% |
| 1 | 3605 | 3.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 96395 | 96.4% |
| 1 | 3605 | 3.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 96395 | 96.4% |
| 1 | 3605 | 3.6% |
Children's
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 92818 | 92.8% |
| 1 | 7182 | 7.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 92818 | 92.8% |
| 1 | 7182 | 7.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 92818 | 92.8% |
| 1 | 7182 | 7.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 92818 | 92.8% |
| 1 | 7182 | 7.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 92818 | 92.8% |
| 1 | 7182 | 7.2% |
Comedy
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 70168 | 70.2% |
| 1 | 29832 | 29.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 70168 | 70.2% |
| 1 | 29832 | 29.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 70168 | 70.2% |
| 1 | 29832 | 29.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 70168 | 70.2% |
| 1 | 29832 | 29.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 70168 | 70.2% |
| 1 | 29832 | 29.8% |
Crime
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 91945 | 91.9% |
| 1 | 8055 | 8.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 91945 | 91.9% |
| 1 | 8055 | 8.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 91945 | 91.9% |
| 1 | 8055 | 8.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 91945 | 91.9% |
| 1 | 8055 | 8.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 91945 | 91.9% |
| 1 | 8055 | 8.1% |
Documentary
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 99242 | 99.2% |
| 1 | 758 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 99242 | 99.2% |
| 1 | 758 | 0.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 99242 | 99.2% |
| 1 | 758 | 0.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 99242 | 99.2% |
| 1 | 758 | 0.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 99242 | 99.2% |
| 1 | 758 | 0.8% |
Drama
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 60105 | 60.1% |
| 1 | 39895 | 39.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 60105 | |
| 1 | 39895 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 60105 | |
| 1 | 39895 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 60105 | |
| 1 | 39895 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 60105 | |
| 1 | 39895 |
Fantasy
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 98648 | 98.6% |
| 1 | 1352 | 1.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 98648 | |
| 1 | 1352 | 1.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 98648 | |
| 1 | 1352 | 1.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 98648 | |
| 1 | 1352 | 1.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 98648 | |
| 1 | 1352 | 1.4% |
Film-Noir
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 98267 | 98.3% |
| 1 | 1733 | 1.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 98267 | |
| 1 | 1733 | 1.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 98267 | |
| 1 | 1733 | 1.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 98267 | |
| 1 | 1733 | 1.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 98267 | |
| 1 | 1733 | 1.7% |
Horror
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 94683 | 94.7% |
| 1 | 5317 | 5.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 94683 | |
| 1 | 5317 | 5.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 94683 | |
| 1 | 5317 | 5.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 94683 | |
| 1 | 5317 | 5.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 94683 | |
| 1 | 5317 | 5.3% |
Musical
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 95046 | 95.0% |
| 1 | 4954 | 5.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 95046 | |
| 1 | 4954 | 5.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 95046 | |
| 1 | 4954 | 5.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 95046 | |
| 1 | 4954 | 5.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 95046 | |
| 1 | 4954 | 5.0% |
Mystery
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 94755 | 94.8% |
| 1 | 5245 | 5.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 94755 | |
| 1 | 5245 | 5.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 94755 | |
| 1 | 5245 | 5.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 94755 | |
| 1 | 5245 | 5.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 94755 | |
| 1 | 5245 | 5.2% |
Romance
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 80539 | 80.5% |
| 1 | 19461 | 19.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 80539 | |
| 1 | 19461 | 19.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 80539 | |
| 1 | 19461 | 19.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 80539 | |
| 1 | 19461 | 19.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 80539 | |
| 1 | 19461 | 19.5% |
Sci-Fi
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 87270 | 87.3% |
| 1 | 12730 | 12.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 87270 | |
| 1 | 12730 | 12.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 87270 | |
| 1 | 12730 | 12.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 87270 | |
| 1 | 12730 | 12.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 87270 | |
| 1 | 12730 | 12.7% |
Thriller
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 78128 | 78.1% |
| 1 | 21872 | 21.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 78128 | |
| 1 | 21872 | 21.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 78128 | |
| 1 | 21872 | 21.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 78128 | |
| 1 | 21872 | 21.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 78128 | |
| 1 | 21872 | 21.9% |
War
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 90602 | 90.6% |
| 1 | 9398 | 9.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 90602 | |
| 1 | 9398 | 9.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 90602 | |
| 1 | 9398 | 9.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 90602 | |
| 1 | 9398 | 9.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 90602 | |
| 1 | 9398 | 9.4% |
Western
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 98146 | 98.1% |
| 1 | 1854 | 1.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 98146 | |
| 1 | 1854 | 1.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 98146 | |
| 1 | 1854 | 1.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 98146 | |
| 1 | 1854 | 1.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 98146 | |
| 1 | 1854 | 1.9% |
year
Date
| Distinct | 71 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 9 |
| Missing (%) | < 0.1% |
| Memory size | 1.5 MiB |
| Minimum | 1922-01-01 00:00:00 |
| Maximum | 1998-01-01 00:00:00 |
genre
Categorical
HIGH CORRELATION 
| Distinct | 19 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 10 |
| Mean length | 6.41247 |
| Min length | 3 |
Characters and Unicode
| Total characters | 641247 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Comedy |
|---|---|
| 2nd row | Action |
| 3rd row | Comedy |
| 4th row | Comedy |
| 5th row | Comedy |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Drama | 23849 | 23.8% |
| Comedy | 17935 | 17.9% |
| Thriller | 12854 | 12.9% |
| Romance | 8208 | 8.2% |
| Action | 7371 | 7.4% |
| Adventure | 5963 | 6.0% |
| Sci-Fi | 3973 | 4.0% |
| Children's | 3720 | 3.7% |
| War | 3681 | 3.7% |
| Crime | 2886 | 2.9% |
| Other values (9) | 9560 | 9.6% |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| drama | 23849 | 23.8% |
| comedy | 17935 | 17.9% |
| thriller | 12854 | 12.9% |
| romance | 8208 | 8.2% |
| action | 7371 | 7.4% |
| adventure | 5963 | 6.0% |
| sci-fi | 3973 | 4.0% |
| children's | 3720 | 3.7% |
| war | 3681 | 3.7% |
| crime | 2886 | 2.9% |
| Other values (9) | 9560 | 9.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 78358 | |
| a | 63340 | 9.9% |
| e | 62504 | 9.7% |
| m | 54886 | 8.6% |
| o | 41028 | 6.4% |
| i | 38695 | 6.0% |
| l | 31515 | 4.9% |
| n | 29049 | 4.5% |
| d | 27618 | 4.3% |
| D | 24605 | 3.8% |
| Other values (21) | 189649 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 528245 | |
| Uppercase Letter | 104636 | 16.3% |
| Dash Punctuation | 4646 | 0.7% |
| Other Punctuation | 3720 | 0.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 78358 | |
| a | 63340 | |
| e | 62504 | |
| m | 54886 | |
| o | 41028 | |
| i | 38695 | |
| l | 31515 | 6.0% |
| n | 29049 | 5.5% |
| d | 27618 | 5.2% |
| y | 22267 | 4.2% |
| Other values (8) | 78985 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 24605 | |
| C | 24541 | |
| A | 13913 | |
| T | 12854 | |
| R | 8208 | 7.8% |
| F | 5148 | 4.9% |
| W | 5022 | 4.8% |
| S | 3973 | 3.8% |
| M | 2951 | 2.8% |
| H | 2748 | 2.6% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4646 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 3720 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 632881 | |
| Common | 8366 | 1.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 78358 | |
| a | 63340 | 10.0% |
| e | 62504 | 9.9% |
| m | 54886 | 8.7% |
| o | 41028 | 6.5% |
| i | 38695 | 6.1% |
| l | 31515 | 5.0% |
| n | 29049 | 4.6% |
| d | 27618 | 4.4% |
| D | 24605 | 3.9% |
| Other values (19) | 181283 |
Common
| Value | Count | Frequency (%) |
| - | 4646 | |
| ' | 3720 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 641247 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 78358 | |
| a | 63340 | 9.9% |
| e | 62504 | 9.7% |
| m | 54886 | 8.6% |
| o | 41028 | 6.4% |
| i | 38695 | 6.0% |
| l | 31515 | 4.9% |
| n | 29049 | 4.5% |
| d | 27618 | 4.3% |
| D | 24605 | 3.8% |
| Other values (21) | 189649 |
all_genres
Text
| Distinct | 216 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 50 |
|---|---|
| Median length | 40 |
| Mean length | 14.78432 |
| Min length | 3 |
Characters and Unicode
| Total characters | 1478432 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Comedy |
|---|---|
| 2nd row | Action-Adventure-Comedy-Sci-Fi |
| 3rd row | Comedy-Romance |
| 4th row | Comedy |
| 5th row | Comedy-Drama |
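The sample rows suggest that all_genres joins a movie's genre names with hyphens. Two genre names in this dataset (Sci-Fi and Film-Noir) contain a hyphen themselves, so a plain split on "-" would break them apart. A minimal sketch of one way to recover the individual genres, assuming the genre list matches the genre column names in this report; the helper `split_genres` is hypothetical:

```python
import re

# Genre names as they appear in this report's column list.
GENRES = [
    "unknown", "Action", "Adventure", "Animation", "Children's", "Comedy",
    "Crime", "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror",
    "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western",
]

# Try longer names first so no name is shadowed by a shorter one.
_GENRE_RE = re.compile(
    "|".join(re.escape(g) for g in sorted(GENRES, key=len, reverse=True))
)

def split_genres(label):
    """Split a hyphen-joined genre label into known genre names."""
    return _GENRE_RE.findall(label)

print(split_genres("Action-Adventure-Comedy-Sci-Fi"))
```

Matching against the known names keeps "Sci-Fi" intact, whereas `"Action-Adventure-Comedy-Sci-Fi".split("-")` would yield five pieces.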
Words
| Value | Count | Frequency (%) |
|---|---|---|
| drama | 13257 | 13.3% |
| comedy | 9828 | 9.8% |
| comedy-romance | 5055 | 5.1% |
| drama-romance | 4767 | 4.8% |
| action-thriller | 3550 | 3.5% |
| drama-thriller | 2627 | 2.6% |
| comedy-drama | 2422 | 2.4% |
| drama-war | 2012 | 2.0% |
| action-adventure-sci-fi | 1865 | 1.9% |
| horror | 1558 | 1.6% |
| Other values (206) | 53059 | 53.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 147568 | 10.0% |
| - | 127058 | 8.6% |
| e | 123619 | 8.4% |
| a | 120670 | 8.2% |
| i | 103788 | 7.0% |
| m | 103339 | 7.0% |
| o | 91622 | 6.2% |
| n | 77189 | 5.2% |
| c | 63492 | 4.3% |
| l | 57613 | 3.9% |
| Other values (21) | 462474 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1117144 | |
| Uppercase Letter | 227048 | 15.4% |
| Dash Punctuation | 127058 | 8.6% |
| Other Punctuation | 7182 | 0.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 147568 | |
| e | 123619 | |
| a | 120670 | |
| i | 103788 | |
| m | 103339 | |
| o | 91622 | |
| n | 77189 | |
| c | 63492 | 5.7% |
| l | 57613 | 5.2% |
| t | 52156 | 4.7% |
| Other values (8) | 176088 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 45069 | |
| A | 42947 | |
| D | 40653 | |
| T | 21872 | |
| R | 19461 | |
| F | 15815 | 7.0% |
| S | 12730 | 5.6% |
| W | 11252 | 5.0% |
| M | 10199 | 4.5% |
| H | 5317 | 2.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 127058 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 7182 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1344192 | |
| Common | 134240 | 9.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 147568 | 11.0% |
| e | 123619 | 9.2% |
| a | 120670 | 9.0% |
| i | 103788 | 7.7% |
| m | 103339 | 7.7% |
| o | 91622 | 6.8% |
| n | 77189 | 5.7% |
| c | 63492 | 4.7% |
| l | 57613 | 4.3% |
| t | 52156 | 3.9% |
| Other values (19) | 403136 |
Common
| Value | Count | Frequency (%) |
| - | 127058 | |
| ' | 7182 | 5.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1478432 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 147568 | 10.0% |
| - | 127058 | 8.6% |
| e | 123619 | 8.4% |
| a | 120670 | 8.2% |
| i | 103788 | 7.0% |
| m | 103339 | 7.0% |
| o | 91622 | 6.2% |
| n | 77189 | 5.2% |
| c | 63492 | 4.3% |
| l | 57613 | 3.9% |
| Other values (21) | 462474 |
age
Real number (ℝ)
| Distinct | 61 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 32.96985 |
| Minimum | 7 |
| Maximum | 73 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 MiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 24 |
| median | 30 |
| Q3 | 40 |
| 95-th percentile | 55 |
| Maximum | 73 |
| Range | 66 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 11.562623 |
|---|---|
| Coefficient of variation (CV) | 0.35070294 |
| Kurtosis | -0.1684152 |
| Mean | 32.96985 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.7331067 |
| Sum | 3296985 |
| Variance | 133.69426 |
| Monotonicity | Not monotonic |
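Several of the figures above follow from one another. As a quick consistency check, the sketch below recomputes the range, interquartile range, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The variable names are illustrative only:

```python
# Values copied from the age statistics above.
mean, std = 32.96985, 11.562623
minimum, q1, q3, maximum = 7, 24, 40, 73

value_range = maximum - minimum  # 66
iqr = q3 - q1                    # 16
cv = std / mean                  # coefficient of variation
variance = std ** 2

print(value_range, iqr, round(cv, 4), round(variance, 2))
```

The recomputed values agree with the table up to rounding of the reported standard deviation.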
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 27 | 6423 | 6.4% |
| 24 | 4556 | 4.6% |
| 20 | 4089 | 4.1% |
| 25 | 4013 | 4.0% |
| 22 | 3979 | 4.0% |
| 30 | 3762 | 3.8% |
| 29 | 3650 | 3.6% |
| 28 | 3619 | 3.6% |
| 32 | 3526 | 3.5% |
| 19 | 3514 | 3.5% |
| Other values (51) | 58869 | 58.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7 | 43 | < 0.1% |
| 10 | 31 | < 0.1% |
| 11 | 27 | < 0.1% |
| 13 | 497 | 0.5% |
| 14 | 264 | 0.3% |
| 15 | 397 | 0.4% |
| 16 | 335 | 0.3% |
| 17 | 897 | 0.9% |
| 18 | 2219 | 2.2% |
| 19 | 3514 | 3.5% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 73 | 56 | 0.1% |
| 70 | 141 | 0.1% |
| 69 | 156 | 0.2% |
| 68 | 92 | 0.1% |
| 66 | 37 | < 0.1% |
| 65 | 229 | 0.2% |
| 64 | 95 | 0.1% |
| 63 | 77 | 0.1% |
| 62 | 46 | < 0.1% |
| 61 | 282 | 0.3% |
sex
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | M |
| 3rd row | M |
| 4th row | M |
| 5th row | M |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| M | 74260 | 74.3% |
| F | 25740 | 25.7% |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| m | 74260 | 74.3% |
| f | 25740 | 25.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| M | 74260 | |
| F | 25740 | 25.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 100000 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 74260 | |
| F | 25740 | 25.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 100000 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| M | 74260 | |
| F | 25740 | 25.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| M | 74260 | |
| F | 25740 | 25.7% |
occupation
Categorical
HIGH CORRELATION 
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 13 |
|---|---|
| Median length | 9 |
| Mean length | 8.10458 |
| Min length | 4 |
Characters and Unicode
| Total characters | 810458 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | writer |
|---|---|
| 2nd row | writer |
| 3rd row | writer |
| 4th row | writer |
| 5th row | writer |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| student | 21957 | 22.0% |
| other | 10663 | 10.7% |
| educator | 9442 | 9.4% |
| engineer | 8175 | 8.2% |
| programmer | 7801 | 7.8% |
| administrator | 7479 | 7.5% |
| writer | 5536 | 5.5% |
| librarian | 5273 | 5.3% |
| technician | 3506 | 3.5% |
| executive | 3403 | 3.4% |
| Other values (11) | 16765 | 16.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 116458 | |
| t | 113342 | |
| r | 102818 | |
| n | 71022 | |
| i | 61708 | |
| a | 61570 | |
| d | 41027 | 5.1% |
| o | 37665 | 4.6% |
| s | 37572 | 4.6% |
| u | 34802 | 4.3% |
| Other values (12) | 132474 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 810458 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 116458 | |
| t | 113342 | |
| r | 102818 | |
| n | 71022 | |
| i | 61708 | |
| a | 61570 | |
| d | 41027 | 5.1% |
| o | 37665 | 4.6% |
| s | 37572 | 4.6% |
| u | 34802 | 4.3% |
| Other values (12) | 132474 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 810458 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 116458 | |
| t | 113342 | |
| r | 102818 | |
| n | 71022 | |
| i | 61708 | |
| a | 61570 | |
| d | 41027 | 5.1% |
| o | 37665 | 4.6% |
| s | 37572 | 4.6% |
| u | 34802 | 4.3% |
| Other values (12) | 132474 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 810458 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 116458 | |
| t | 113342 | |
| r | 102818 | |
| n | 71022 | |
| i | 61708 | |
| a | 61570 | |
| d | 41027 | 5.1% |
| o | 37665 | 4.6% |
| s | 37572 | 4.6% |
| u | 34802 | 4.3% |
| Other values (12) | 132474 |
zip_code
Text
| Distinct | 795 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Characters and Unicode
| Total characters | 500000 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 55105 |
|---|---|
| 2nd row | 55105 |
| 3rd row | 55105 |
| 4th row | 55105 |
| 5th row | 55105 |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| 55414 | 1103 | 1.1% |
| 20009 | 878 | 0.9% |
| 10019 | 850 | 0.9% |
| 22902 | 832 | 0.8% |
| 61820 | 817 | 0.8% |
| 48103 | 746 | 0.7% |
| 10003 | 736 | 0.7% |
| 60657 | 685 | 0.7% |
| 80525 | 678 | 0.7% |
| 83702 | 639 | 0.6% |
| Other values (785) | 92036 | 92.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 84079 | |
| 1 | 66289 | |
| 2 | 58132 | |
| 5 | 53721 | |
| 4 | 45130 | |
| 3 | 43212 | |
| 9 | 38841 | |
| 7 | 35743 | |
| 6 | 35322 | |
| 8 | 33273 | 6.7% |
| Other values (16) | 6258 | 1.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 493742 | |
| Uppercase Letter | 6258 | 1.3% |
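The category names in this table match Unicode general categories: "Decimal Number" is `Nd` and "Uppercase Letter" is `Lu`. A minimal sketch of how such a tally could be reproduced with Python's standard `unicodedata` module; whether the profiling tool computes it exactly this way is an assumption, and `category_counts` is a hypothetical helper:

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# "55105" appears in the sample above; "V3N4P" is a made-up alphanumeric code.
counts = category_counts(["55105", "V3N4P"])
print(counts)
```

Here the two five-character strings contribute seven `Nd` characters and three `Lu` characters, mirroring how a small share of letters shows up alongside digits in zip_code.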
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 1262 | |
| A | 812 | |
| H | 627 | |
| V | 612 | |
| L | 569 | |
| E | 534 | |
| T | 359 | 5.7% |
| P | 316 | 5.0% |
| B | 309 | 4.9% |
| R | 214 | 3.4% |
| Other values (6) | 644 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 84079 | |
| 1 | 66289 | |
| 2 | 58132 | |
| 5 | 53721 | |
| 4 | 45130 | |
| 3 | 43212 | |
| 9 | 38841 | |
| 7 | 35743 | |
| 6 | 35322 | |
| 8 | 33273 | 6.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 493742 | |
| Latin | 6258 | 1.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| N | 1262 | |
| A | 812 | |
| H | 627 | |
| V | 612 | |
| L | 569 | |
| E | 534 | |
| T | 359 | 5.7% |
| P | 316 | 5.0% |
| B | 309 | 4.9% |
| R | 214 | 3.4% |
| Other values (6) | 644 |
Common
| Value | Count | Frequency (%) |
| 0 | 84079 | |
| 1 | 66289 | |
| 2 | 58132 | |
| 5 | 53721 | |
| 4 | 45130 | |
| 3 | 43212 | |
| 9 | 38841 | |
| 7 | 35743 | |
| 6 | 35322 | |
| 8 | 33273 | 6.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 500000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 84079 | |
| 1 | 66289 | |
| 2 | 58132 | |
| 5 | 53721 | |
| 4 | 45130 | |
| 3 | 43212 | |
| 9 | 38841 | |
| 7 | 35743 | |
| 6 | 35322 | |
| 8 | 33273 | 6.7% |
| Other values (16) | 6258 | 1.3% |
| | user_id | movie_id | unix_timestamp | age | rating | unknown | Action | Adventure | Animation | Children's | Comedy | Crime | Documentary | Drama | Fantasy | Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi | Thriller | War | Western | genre | sex | occupation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| user_id | 1.000 | -0.007 | 0.038 | -0.005 | 0.278 | 0.000 | 0.227 | 0.155 | 0.139 | 0.185 | 0.176 | 0.094 | 0.061 | 0.207 | 0.000 | 0.072 | 0.207 | 0.102 | 0.119 | 0.118 | 0.142 | 0.177 | 0.083 | 0.040 | 0.085 | 0.995 | 0.995 |
| movie_id | -0.007 | 1.000 | 0.025 | 0.010 | 0.260 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.992 | 0.146 | 0.052 |
| unix_timestamp | 0.038 | 0.025 | 1.000 | 0.122 | 0.039 | 0.023 | 0.032 | 0.028 | 0.019 | 0.023 | 0.022 | 0.007 | 0.005 | 0.044 | 0.012 | 0.016 | 0.025 | 0.005 | 0.030 | 0.014 | 0.028 | 0.018 | 0.022 | 0.000 | 0.022 | 0.084 | 0.211 |
| age | -0.005 | 0.010 | 0.122 | 1.000 | 0.045 | 0.000 | 0.065 | 0.039 | 0.038 | 0.046 | 0.038 | 0.019 | 0.016 | 0.080 | 0.012 | 0.034 | 0.040 | 0.021 | 0.032 | 0.023 | 0.044 | 0.040 | 0.037 | 0.009 | 0.036 | 0.125 | 0.361 |
| rating | 0.278 | 0.260 | 0.039 | 0.045 | 1.000 | 0.003 | 0.033 | 0.020 | 0.009 | 0.045 | 0.079 | 0.028 | 0.018 | 0.115 | 0.034 | 0.047 | 0.051 | 0.006 | 0.023 | 0.040 | 0.018 | 0.022 | 0.088 | 0.015 | 0.071 | 0.045 | 0.081 |
| unknown | 0.000 | 0.992 | 0.023 | 0.000 | 0.003 | 1.000 | 0.004 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.006 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 0.003 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 |
| Action | 0.227 | 0.992 | 0.032 | 0.065 | 0.033 | 0.004 | 1.000 | 0.451 | 0.099 | 0.145 | 0.223 | 0.007 | 0.051 | 0.270 | 0.013 | 0.078 | 0.007 | 0.091 | 0.033 | 0.018 | 0.324 | 0.250 | 0.167 | 0.063 | 0.672 | 0.060 | 0.084 |
| Adventure | 0.155 | 0.992 | 0.028 | 0.039 | 0.020 | 0.000 | 0.451 | 1.000 | 0.024 | 0.100 | 0.113 | 0.030 | 0.035 | 0.224 | 0.088 | 0.053 | 0.059 | 0.025 | 0.043 | 0.018 | 0.295 | 0.049 | 0.087 | 0.010 | 0.697 | 0.026 | 0.052 |
| Animation | 0.139 | 0.992 | 0.019 | 0.038 | 0.009 | 0.000 | 0.099 | 0.024 | 1.000 | 0.555 | 0.029 | 0.057 | 0.016 | 0.157 | 0.026 | 0.025 | 0.028 | 0.418 | 0.045 | 0.085 | 0.045 | 0.077 | 0.056 | 0.026 | 0.680 | 0.008 | 0.036 |
| Children's | 0.185 | 0.992 | 0.023 | 0.046 | 0.045 | 0.000 | 0.145 | 0.100 | 0.555 | 1.000 | 0.083 | 0.082 | 0.024 | 0.130 | 0.238 | 0.037 | 0.066 | 0.381 | 0.055 | 0.119 | 0.042 | 0.144 | 0.085 | 0.031 | 0.761 | 0.034 | 0.038 |
| Comedy | 0.176 | 0.992 | 0.022 | 0.038 | 0.079 | 0.004 | 0.223 | 0.113 | 0.029 | 0.083 | 1.000 | 0.091 | 0.057 | 0.347 | 0.017 | 0.086 | 0.074 | 0.035 | 0.111 | 0.096 | 0.146 | 0.290 | 0.120 | 0.000 | 0.756 | 0.019 | 0.037 |
| Crime | 0.094 | 0.992 | 0.007 | 0.019 | 0.028 | 0.000 | 0.007 | 0.030 | 0.057 | 0.082 | 0.091 | 1.000 | 0.025 | 0.064 | 0.005 | 0.164 | 0.015 | 0.067 | 0.088 | 0.102 | 0.087 | 0.124 | 0.095 | 0.040 | 0.616 | 0.023 | 0.017 |
| Documentary | 0.061 | 0.992 | 0.005 | 0.016 | 0.018 | 0.000 | 0.051 | 0.035 | 0.016 | 0.024 | 0.057 | 0.025 | 1.000 | 0.058 | 0.009 | 0.011 | 0.020 | 0.019 | 0.020 | 0.043 | 0.033 | 0.046 | 0.006 | 0.011 | 0.999 | 0.000 | 0.031 |
| Drama | 0.207 | 0.992 | 0.044 | 0.080 | 0.115 | 0.006 | 0.270 | 0.224 | 0.157 | 0.130 | 0.347 | 0.064 | 0.058 | 1.000 | 0.020 | 0.083 | 0.159 | 0.096 | 0.069 | 0.013 | 0.174 | 0.163 | 0.099 | 0.033 | 0.742 | 0.034 | 0.082 |
| Fantasy | 0.000 | 0.992 | 0.012 | 0.012 | 0.034 | 0.000 | 0.013 | 0.088 | 0.026 | 0.238 | 0.017 | 0.005 | 0.009 | 0.020 | 1.000 | 0.015 | 0.027 | 0.026 | 0.027 | 0.017 | 0.126 | 0.047 | 0.037 | 0.015 | 0.614 | 0.000 | 0.010 |
| Film-Noir | 0.072 | 0.992 | 0.016 | 0.034 | 0.047 | 0.000 | 0.078 | 0.053 | 0.025 | 0.037 | 0.086 | 0.164 | 0.011 | 0.083 | 0.015 | 1.000 | 0.031 | 0.030 | 0.232 | 0.055 | 0.016 | 0.110 | 0.043 | 0.018 | 0.645 | 0.010 | 0.029 |
| Horror | 0.207 | 0.992 | 0.025 | 0.040 | 0.051 | 0.000 | 0.007 | 0.059 | 0.028 | 0.066 | 0.074 | 0.015 | 0.020 | 0.159 | 0.027 | 0.031 | 1.000 | 0.054 | 0.000 | 0.076 | 0.034 | 0.070 | 0.076 | 0.032 | 0.718 | 0.017 | 0.049 |
| Musical | 0.102 | 0.992 | 0.005 | 0.021 | 0.006 | 0.000 | 0.091 | 0.025 | 0.418 | 0.381 | 0.035 | 0.067 | 0.019 | 0.096 | 0.026 | 0.030 | 0.054 | 1.000 | 0.054 | 0.010 | 0.081 | 0.111 | 0.055 | 0.031 | 0.640 | 0.017 | 0.021 |
| Mystery | 0.119 | 0.992 | 0.030 | 0.032 | 0.023 | 0.000 | 0.033 | 0.043 | 0.045 | 0.055 | 0.111 | 0.088 | 0.020 | 0.069 | 0.027 | 0.232 | 0.000 | 0.054 | 1.000 | 0.060 | 0.031 | 0.230 | 0.076 | 0.032 | 0.579 | 0.002 | 0.029 |
| Romance | 0.118 | 0.992 | 0.014 | 0.023 | 0.040 | 0.002 | 0.018 | 0.018 | 0.085 | 0.119 | 0.096 | 0.102 | 0.043 | 0.013 | 0.017 | 0.055 | 0.076 | 0.010 | 0.060 | 1.000 | 0.063 | 0.106 | 0.127 | 0.052 | 0.638 | 0.049 | 0.039 |
| Sci-Fi | 0.142 | 0.992 | 0.028 | 0.044 | 0.018 | 0.000 | 0.324 | 0.295 | 0.045 | 0.042 | 0.146 | 0.087 | 0.033 | 0.174 | 0.126 | 0.016 | 0.034 | 0.081 | 0.031 | 0.063 | 1.000 | 0.047 | 0.167 | 0.052 | 0.613 | 0.044 | 0.055 |
| Thriller | 0.177 | 0.992 | 0.018 | 0.040 | 0.022 | 0.003 | 0.250 | 0.049 | 0.077 | 0.144 | 0.290 | 0.124 | 0.046 | 0.163 | 0.047 | 0.110 | 0.070 | 0.111 | 0.230 | 0.106 | 0.047 | 1.000 | 0.100 | 0.073 | 0.770 | 0.030 | 0.047 |
| War | 0.083 | 0.992 | 0.022 | 0.037 | 0.088 | 0.000 | 0.167 | 0.087 | 0.056 | 0.085 | 0.120 | 0.095 | 0.006 | 0.099 | 0.037 | 0.043 | 0.076 | 0.055 | 0.076 | 0.127 | 0.167 | 0.100 | 1.000 | 0.023 | 0.636 | 0.018 | 0.038 |
| Western | 0.040 | 0.992 | 0.000 | 0.009 | 0.015 | 0.000 | 0.063 | 0.010 | 0.026 | 0.031 | 0.000 | 0.040 | 0.011 | 0.033 | 0.015 | 0.018 | 0.032 | 0.031 | 0.032 | 0.052 | 0.052 | 0.073 | 0.023 | 1.000 | 0.853 | 0.018 | 0.021 |
| genre | 0.085 | 0.992 | 0.022 | 0.036 | 0.071 | 1.000 | 0.672 | 0.697 | 0.680 | 0.761 | 0.756 | 0.616 | 0.999 | 0.742 | 0.614 | 0.645 | 0.718 | 0.640 | 0.579 | 0.638 | 0.613 | 0.770 | 0.636 | 0.853 | 1.000 | 0.081 | 0.029 |
| sex | 0.995 | 0.146 | 0.084 | 0.125 | 0.045 | 0.000 | 0.060 | 0.026 | 0.008 | 0.034 | 0.019 | 0.023 | 0.000 | 0.034 | 0.000 | 0.010 | 0.017 | 0.017 | 0.002 | 0.049 | 0.044 | 0.030 | 0.018 | 0.018 | 0.081 | 1.000 | 0.409 |
| occupation | 0.995 | 0.052 | 0.211 | 0.361 | 0.081 | 0.000 | 0.084 | 0.052 | 0.036 | 0.038 | 0.037 | 0.017 | 0.031 | 0.082 | 0.010 | 0.029 | 0.049 | 0.021 | 0.029 | 0.039 | 0.055 | 0.047 | 0.038 | 0.021 | 0.029 | 0.409 | 1.000 |
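
To read a matrix of this size, it can help to list only the strongest pairs. The sketch below uses a handful of values copied from the table above and an assumed cut-off of 0.5; the cut-off and the helper `strong_pairs` are illustrative choices, not settings taken from the report.

```python
# A few (row, column) entries copied from the correlation table above.
pairs = {
    ("user_id", "sex"): 0.995,
    ("user_id", "occupation"): 0.995,
    ("unknown", "genre"): 1.000,
    ("Animation", "Children's"): 0.555,
    ("Action", "Adventure"): 0.451,
    ("sex", "occupation"): 0.409,
    ("age", "occupation"): 0.361,
}

THRESHOLD = 0.5  # assumed cut-off, for illustration only


def strong_pairs(pairs, threshold):
    """Return (a, b, value) tuples with value >= threshold, strongest first."""
    selected = [(a, b, v) for (a, b), v in pairs.items() if v >= threshold]
    return sorted(selected, key=lambda item: item[2], reverse=True)


result = strong_pairs(pairs, THRESHOLD)
for a, b, value in result:
    print(f"{a} ~ {b}: {value:.3f}")
```
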
First rows
| | user_id | movie_id | rating | unix_timestamp | title | release_date | video_release_date | imdb_url | unknown | Action | Adventure | Animation | Children's | Comedy | Crime | Documentary | Drama | Fantasy | Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi | Thriller | War | Western | year | genre | all_genres | age | sex | occupation | zip_code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 195 | 241 | 3.000 | 881250949 | Kolya (1996) | 24-Jan-1997 | NaN | http://us.imdb.com/M/title-exact?Kolya%20(1996) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1997 | Comedy | Comedy | 49 | M | writer | 55105 |
| 1 | 195 | 256 | 2.000 | 881251577 | Men in Black (1997) | 04-Jul-1997 | NaN | http://us.imdb.com/M/title-exact?Men+in+Black+(1997) | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1997 | Action | Action-Adventure-Comedy-Sci-Fi | 49 | M | writer | 55105 |
| 2 | 195 | 110 | 4.000 | 881251793 | Truth About Cats & Dogs, The (1996) | 26-Apr-1996 | NaN | http://us.imdb.com/M/title-exact?Truth%20About%20Cats%20&%20Dogs,%20The%20(1996) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1996 | Comedy | Comedy-Romance | 49 | M | writer | 55105 |
| 3 | 195 | 24 | 4.000 | 881251955 | Birdcage, The (1996) | 08-Mar-1996 | NaN | http://us.imdb.com/M/title-exact?Birdcage,%20The%20(1996) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1996 | Comedy | Comedy | 49 | M | writer | 55105 |
| 4 | 195 | 381 | 4.000 | 881251843 | Adventures of Priscilla, Queen of the Desert, The (1994) | 01-Jan-1994 | NaN | http://us.imdb.com/M/title-exact?Adventures%20of%20Priscilla,%20Queen%20of%20the%20Desert,%20The%20(1994) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1994 | Comedy | Comedy-Drama | 49 | M | writer | 55105 |
| 5 | 195 | 201 | 3.000 | 881251728 | Groundhog Day (1993) | 01-Jan-1993 | NaN | http://us.imdb.com/M/title-exact?Groundhog%20Day%20(1993) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1993 | Romance | Comedy-Romance | 49 | M | writer | 55105 |
| 6 | 195 | 152 | 5.000 | 881251820 | Fish Called Wanda, A (1988) | 01-Jan-1988 | NaN | http://us.imdb.com/M/title-exact?Fish%20Called%20Wanda,%20A%20(1988) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1988 | Comedy | Comedy | 49 | M | writer | 55105 |
| 7 | 195 | 285 | 5.000 | 881250949 | English Patient, The (1996) | 15-Nov-1996 | NaN | http://us.imdb.com/M/title-exact?English%20Patient,%20The%20(1996) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1996 | Drama | Drama-Romance-War | 49 | M | writer | 55105 |
| 8 | 195 | 65 | 3.000 | 881251911 | While You Were Sleeping (1995) | 01-Jan-1995 | NaN | http://us.imdb.com/M/title-exact?While%20You%20Were%20Sleeping%20(1995) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1995 | Comedy | Comedy-Romance | 49 | M | writer | 55105 |
| 9 | 195 | 844 | 4.000 | 881251954 | That Thing You Do! (1996) | 28-Sep-1996 | NaN | http://us.imdb.com/M/title-exact?That%20Thing%20You%20Do!%20(1996) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1996 | Comedy | Comedy | 49 | M | writer | 55105 |
Last rows
| | user_id | movie_id | rating | unix_timestamp | title | release_date | video_release_date | imdb_url | unknown | Action | Adventure | Animation | Children's | Comedy | Crime | Documentary | Drama | Fantasy | Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi | Thriller | War | Western | year | genre | all_genres | age | sex | occupation | zip_code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 99990 | 872 | 288 | 2.000 | 891392577 | Evita (1996) | 25-Dec-1996 | NaN | http://us.imdb.com/M/title-exact?Evita%20(1996) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1996 | Drama | Drama-Musical | 48 | F | administrator | 33763 |
| 99991 | 872 | 291 | 5.000 | 891392177 | Rosewood (1997) | 21-Feb-1997 | NaN | http://us.imdb.com/M/title-exact?Rosewood%20(1997) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1997 | Drama | Drama | 48 | F | administrator | 33763 |
| 99992 | 872 | 268 | 2.000 | 891392092 | Full Monty, The (1997) | 01-Jan-1997 | NaN | http://us.imdb.com/M/title-exact?Full+Monty%2C+The+(1997) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1997 | Comedy | Comedy | 48 | F | administrator | 33763 |
| 99993 | 872 | 874 | 1.000 | 891392577 | She's So Lovely (1997) | 22-Aug-1997 | NaN | http://us.imdb.com/M/title-exact?She%27s+So+Lovely+(1997) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1997 | Romance | Drama-Romance | 48 | F | administrator | 33763 |
| 99994 | 872 | 299 | 4.000 | 891392238 | Air Force One (1997) | 01-Jan-1997 | NaN | http://us.imdb.com/M/title-exact?Air+Force+One+(1997) | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1997 | Thriller | Action-Thriller | 48 | F | administrator | 33763 |
| 99995 | 872 | 312 | 5.000 | 891392177 | Titanic (1997) | 01-Jan-1997 | NaN | http://us.imdb.com/M/title-exact?imdb-title-120338 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1997 | Drama | Action-Drama-Romance | 48 | F | administrator | 33763 |
| 99996 | 872 | 325 | 4.000 | 891392656 | G.I. Jane (1997) | 01-Jan-1997 | NaN | http://us.imdb.com/M/title-exact?G%2EI%2E+Jane+(1997) | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1997 | War | Action-Drama-War | 48 | F | administrator | 33763 |
| 99997 | 872 | 347 | 3.000 | 891392577 | Desperate Measures (1998) | 30-Jan-1998 | NaN | http://us.imdb.com/Title?Desperate+Measures+(1998) | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1998 | Thriller | Crime-Drama-Thriller | 48 | F | administrator | 33763 |
| 99998 | 872 | 357 | 2.000 | 891392698 | Spawn (1997) | 01-Aug-1997 | NaN | http://us.imdb.com/M/title-exact?Spawn+(1997/I) | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1997 | Adventure | Action-Adventure-Sci-Fi-Thriller | 48 | F | administrator | 33763 |
| 99999 | 872 | 341 | 4.000 | 891392698 | Man Who Knew Too Little, The (1997) | 01-Jan-1997 | NaN | http://us.imdb.com/M/title-exact?Man+Who+Knew+Too+Little%2C+The+(1997) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1997 | Comedy | Comedy-Mystery | 48 | F | administrator | 33763 |
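
In the sample rows above, `all_genres` matches the names of the genre flag columns set to 1, joined with hyphens in column order. The sketch below reproduces that pattern for one row; it is inferred from the samples only, and the helper `join_genres` is hypothetical rather than the code that built the column.

```python
# Genre flag columns in the order they appear in the sample tables.
GENRES = [
    "unknown", "Action", "Adventure", "Animation", "Children's", "Comedy",
    "Crime", "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror",
    "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western",
]


def join_genres(flags):
    """Join the names of all set genre flags with '-' in column order."""
    if len(flags) != len(GENRES):
        raise ValueError(f"expected {len(GENRES)} flags, got {len(flags)}")
    return "-".join(name for name, flag in zip(GENRES, flags) if flag)


# Flags copied from the "Men in Black (1997)" row (index 1).
men_in_black = [0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
print(join_genres(men_in_black))  # Action-Adventure-Comedy-Sci-Fi
```

The single-valued `genre` column does not follow from the flags in the same way (for example, it is not always the first set flag), so it is left out of this sketch.
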